Abstract

The dependence of counts in cells on the shape of the cell is studied for the large-scale
galaxy distribution. A very concrete prediction can be made concerning the void distribution
for scale-invariant models. The prediction is tested on a sample of the CfA catalog, and good
agreement is found. It is observed that the probability of voids is larger for spherical cells
than for elongated ones, whereas the probability of a cell being occupied is larger for some
elongated cells. A phenomenological scale-invariant model for the observed distribution of
the counts in cells (an extension of the negative binomial distribution) is presented in order
to illustrate how this dependence can be quantitatively determined. An original, intuitive
derivation of this model is presented.

1 Introduction

One of the standard approaches to the study of the 3-D space distribution of galaxies
over distances of tens of Mpc has been the consideration of the two-point correlation
function, $\xi(r)$, which can be derived both from observational data, namely angular
(e.g. Groth and Peebles 1977) or redshift (e.g. Davis and Peebles 1983) catalogs of galaxy
positions, and from specific cosmological scenarios: $\xi$ is directly related to the power
spectrum of fluctuations (see e.g. Peebles 1980). Some striking large-scale observational
results concerning big voids (Kirschner 1981, Bothun et al. 1986), 'bubbles'
(de Lapparent, Geller and Huchra 1986), 'great walls' (Geller and Huchra 1989) or
'sponge-like' topologies (Gott et al. 1989) have also proven to be relevant differential
criteria to study this structure. They do not seem to be trivially related to $\xi$, but
rather to a complicated integration of all N-point correlation functions. It is therefore
important to address the question of extracting statistical information from the
higher-order correlation functions.

Counts in cells, the probabilities $P_i$ to have $i$ galaxies inside a randomly chosen
cell of volume $V$, and in particular the void probability $P_0$, are related to the
higher-order correlations, as is clear e.g. from (6) below. They have received increasing
attention as a tool to study 3-D redshift catalogs from data, and to compare them
with simulations or theoretical predictions (see e.g. Fall et al. 1976, White 1979,
Sharp 1981, Fry 1984, Ryden and Turner 1984, Saslaw and Hamilton 1984, Fry 1985,
Hamilton 1985, Schaeffer 1985, Fry 1986, Otto et al. 1986, Bouchet and Lachieze-Rey
1986, White et al. 1987, Maurogordato and Lachieze-Rey 1987, Melott 1987,
Balian and Schaeffer 1988, Elizalde and Gaztanaga 1988, Fry et al. 1989, Balian and
Schaeffer 1989, Elizalde and Gaztanaga 1990, Maurogordato and Lachieze-Rey 1991,
and references therein). The void probability $P_0$ is of special interest, since it generates
all the other counts in cells $P_i$ (see White 1979), although it is not trivially related to the
observed big voids, because they are not statistically significant (see Otto et al. 1986).
The void probability contains higher-order correlation functions, generates counts in
cells and provides an easy way to test scale-invariant models. In this paper we study
the dependence of the void probability and of the other counts in cells
on the shape of the cell. This dependence might be appropriate to address questions
such as whether elongated or spherical cell occupancy is statistically more significant.
Furthermore, the analysis of the shape dependence might be useful when cell counts
must be binned in non-spherical cells for practical purposes.

We shall concentrate on a phenomenological scale-invariant model for counts in
cells: an extension of the negative binomial distribution. It is very simple and seems
to reproduce fairly well the observed redshift distribution. We first present (in Section
2) an original and intuitive derivation of this distribution. In Sections 3 and 4 we
compare the model with a sample from the CfA redshift catalog. We will argue in the
discussion (Section 5) that the analysis presented below can be applied to any
scale-invariant model. For scale-invariant models the shape dependence will be shown to be
fixed by the volume dependence.


2 The negative binomial distribution 

A continuous generalization of the negative binomial distribution, see (17) below, which
was originally applied to hadron multiplicities at high-energy colliders (Carruthers and
Shih 1983), has been used by Fry (1986) and Fry et al. (1989) as a phenomenological
illustration of a model with a scaling relation on N-point correlation functions. This
scaling property seems to be related to the so-called hierarchical universes. Although
no physical or intuitive explanation has yet been given, the negative binomial
distribution does provide fair agreement with the observational distributions (Fry et al.
1989) of the Giovanelli-Haynes (1985) catalogue over Pisces-Perseus. As pointed out
by Fry et al. (1989), the agreement, as shown in the scaling function $\chi$, is not
perfectly accurate but can be used as a first analytical approximation to the observed
galaxy distribution. It is not clear to us whether the small, but systematic,
discrepancies between the model and the observational counts correspond to real differences,
since systematic errors coming from peculiar velocities have not yet been estimated. It
is well known that peculiar velocities produce spurious effects on the estimate of the
two-point correlation function $\xi$ from redshift catalogs (e.g. Davis and Peebles 1983,
de Lapparent, Geller and Huchra 1988). This, by itself, affects the analysis by Fry et
al. (1989), because the scaling function $\chi$ is studied as a function of $\bar{\xi}$, which
is directly extracted from data but is not corrected for peculiar velocities. One might also
expect further distortions, coming from peculiar velocities, on higher-order correlations,
and thus on $P_0$. These effects are by no means small, especially when samples are close
to a big cluster (like Pisces-Perseus), since they can distort small-scale into large-scale
statistics. In fact, the anisotropies presented in Section 4 below might be caused by
peculiar velocity distortions.

Balian and Schaeffer (1989) compiled different values for $P_0$ obtained from different
catalogs, including the ones from Fry et al. (1989) of Pisces-Perseus, and the ones
from Maurogordato and Lachieze-Rey (1987) of the CfA redshift catalog (Huchra et al.
1983), showing that the scaling function, $\chi = -(\log P_0)/nV$, with $n$ the density and
$V$ the cell volume, can be fitted to a power law, $\chi \sim (nV\bar{\xi})^{-\omega}$, $\bar{\xi}$ being
an average over $\xi$ [see (19) below] and $\omega$ a parameter fitted to about $\omega \simeq 0.7$.
This same compilation also fits the negative binomial expression,
$\chi = [\log(1 + nV\bar{\xi})]/nV\bar{\xi}$, which has no free
parameter. As pointed out by Balian and Schaeffer (1988 and 1989), the agreement [1]
obtained for different catalogs and samples, with different luminosities and densities,
is highly non-trivial and provides a good verification of scale-invariant models.

We have independently found good agreement with the negative binomial
distribution, which we called the quasi-Poisson model, both in an analysis over the CfA2
slice sample from the de Lapparent, Geller and Huchra (1986) and Huchra et al. (1990)
redshift catalog (Elizalde and Gaztanaga 1988) and over a sample from the Huchra et
al. (1983) CfA1 redshift catalog (Elizalde and Gaztanaga 1990).

The original discrete negative binomial distribution is well known from standard
statistics (see e.g. Eadie et al. 1971). It accounts for the probability, $P_i$, of the number
$i$ of trials necessary for $m$ successes to occur, the events being independent and having
each a probability $p$ of success:

$$P_i = \binom{i-1}{m-1}\, p^m (1-p)^{i-m}, \qquad (1)$$

with $i \ge m$. If we use a new variable, defined as $s = i - m$, we have the (also common)
expression

$$P_s = \binom{m+s-1}{s}\, p^m (1-p)^{s}, \qquad (2)$$

with $s = 0, 1, \ldots$ It is this last form that leads to the distribution under study
when generalized to continuous values of $m$. Indeed, for $m = 1/\bar{\xi}$ and
$p = 1/(1 + nV\bar{\xi})$ it is easy to reproduce formula (28) of Fry (1986), or the equivalent
expression (17) below. After performing these changes one completely loses the original
simple interpretation of the discrete negative binomial distribution. In order to recover such
an interpretation we summarize below our original and independent derivation of the
negative binomial distribution, which was constructed from a simple and intuitive statistical
model (we called it quasi-Poisson) for the box distribution of a sample of galaxies. Only
recently have we realized that our quasi-Poisson model has the
same distribution of counts in cells as the negative binomial distribution (2). The quasi-Poisson
model is, however, much more concrete, because we also have explicit expressions for
the distribution in boxes, the variance and the higher-order moments corresponding
to the cell counts. In Elizalde and Gaztanaga (1990) a different derivation of the
model was presented. There, the quasi-Poisson distribution was obtained directly from
an initial guess for the configurational probability, $P_N \sim \prod [1 + \cdots]$, the same as
expression (15) of Politzer and Wise (1984) for bias statistics (Bardeen et al. 1986). This
alternative derivation was only valid under a very restrictive, low-density condition,
where the cell occupancy is strictly less than or equal to one galaxy per cell.

[1] In particular, the fact that all data yield $\chi$ as a function of $nV\bar{\xi}$ only.
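The continuation of (2) to non-integer $m$ can be illustrated numerically. The following is only a minimal sketch: the function names are hypothetical, and the binomial coefficient is continued through the Gamma function, which is one standard way of doing so and is assumed here rather than taken from the text.

```python
import math


def negative_binomial_pmf(s, m, p):
    """Eq. (2) with real m: C(m+s-1, s) p^m (1-p)^s, via log-Gamma functions."""
    log_coeff = math.lgamma(m + s) - math.lgamma(s + 1) - math.lgamma(m)
    return math.exp(log_coeff + m * math.log(p) + s * math.log1p(-p))


def cell_count_probability(i, nV, xi_bar):
    """Counts in cells obtained from (2) with m = 1/xi_bar, p = 1/(1 + nV*xi_bar)."""
    m = 1.0 / xi_bar
    p = 1.0 / (1.0 + nV * xi_bar)
    return negative_binomial_pmf(i, m, p)


if __name__ == "__main__":
    nV, xi_bar = 3.0, 0.7  # illustrative values, not taken from any catalog
    probs = [cell_count_probability(i, nV, xi_bar) for i in range(400)]
    print("normalization:", sum(probs))
    print("mean count   :", sum(i * q for i, q in enumerate(probs)), "expected", nV)
```

With this parametrization the mean count equals $nV$ for any $\bar{\xi}$, which is why $p$ must depend on the density as stated above.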


2.1 Counts in cells 

Let us first introduce some concepts and notations. The problem of characterizing
the large-scale structure of the universe can be formally solved, statistically, from the
knowledge of the (configurational) probability of having a certain spatial point
distribution $r_1, \ldots, r_N$. This can be shown to be equivalent to the knowledge of all the
N-point correlation functions, $\xi_N(r_1, \ldots, r_N)$ (Peebles 1980), which are direct
observational quantities, although impossible to compute in practice for $N > 3$. In general,
evaluations are restricted to $N = 2$, where there are already big uncertainties, and
occasionally they reach $N = 3$ (Groth and Peebles 1977) or $N = 4$ (Fry and Peebles 1978),
which are known to be very elusive correlation functions, especially for redshift surveys.
The correlation functions can also be related to the counts in cells $P_i(V_c)$, the
probabilities of having $i$ galaxies inside an arbitrary cell of volume $V_c$. These probabilities
are easy-to-measure observable quantities that contain statistical information about
the higher-order correlation functions. The connection is made by introducing a
generating function

$$G(\lambda) \equiv \sum_i P_i \lambda^i, \qquad (3)$$

which can be shown to be (see e.g. White 1979, Fry 1985, Fry 1986, Otto et al. 1986,
Balian and Schaeffer 1988):

$$G(\lambda) = \exp\left[\sum_{N=1}^{\infty} \frac{n^N (\lambda - 1)^N}{N!} \int_{V_c} dr_1 \cdots \int_{V_c} dr_N\, \xi_N(r_1, \ldots, r_N)\right], \qquad (4)$$

where $n$ is the system density, $dr$ is the differential of volume and $V_c$ is the cell volume.
Moreover, $\xi_1 = 1$. By taking convenient derivatives of $G(\lambda)$ and by evaluating them at
$\lambda = 1$, one can easily find the relations between the $P_i$ and the $\xi_N$. For instance, the
second derivative gives

$$\sum_i i^2 P_i = (nV_c)^2 + nV_c + n^2 \int_{V_c}\int_{V_c} \xi_2(r_1, r_2)\, dr_1\, dr_2, \qquad (5)$$

which is the well-known result that the two-point correlation function, $\xi$ for short, gives
the system number density fluctuations. It is also interesting to notice that for $\lambda = 0$
we have


$$P_0 = \exp\left[\sum_{N=1}^{\infty} \frac{(-n)^N}{N!} \int_{V_c} dr_1 \cdots \int_{V_c} dr_N\, \xi_N(r_1, \ldots, r_N)\right], \qquad (6)$$

so that $P_0$ depends, in principle, on correlation functions of all orders. Notice in particular
that for an uncorrelated system all the $\xi_N$ are zero (except for $\xi_1$), and we get the
Poisson result $\exp(-nV_c)$. We see also that $P_0$ is obtained in terms of a complicated
integration of all N-point correlation functions.
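As a worked step, the passage from (4) to (5) and (6) can be spelled out as follows. The shorthand $I_N$ is introduced here only for brevity and is not part of the original notation.

```latex
% Shorthand (assumption of this sketch): I_N is the N-fold cell integral of xi_N.
\[
  I_N \equiv \int_{V_c} dr_1 \cdots \int_{V_c} dr_N \, \xi_N(r_1,\ldots,r_N),
  \qquad I_1 = V_c ,
  \qquad \ln G(\lambda) = \sum_{N=1}^{\infty} \frac{n^N (\lambda-1)^N}{N!}\, I_N .
\]
% Derivatives of (3) and (4) at lambda = 1, where G(1) = 1:
\[
  G'(1) = \sum_i i\, P_i = n V_c ,
  \qquad
  G''(1) = \sum_i i(i-1)\, P_i = (n V_c)^2 + n^2 I_2 .
\]
% Adding the two lines gives eq. (5):
\[
  \sum_i i^2 P_i = G''(1) + G'(1) = (n V_c)^2 + n V_c + n^2 I_2 .
\]
% Setting lambda = 0 in (3) and (4) gives eq. (6):
\[
  P_0 = G(0) = \exp\left[ \sum_{N=1}^{\infty} \frac{(-n)^N}{N!}\, I_N \right].
\]
```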


2.2 Distribution in boxes 


To introduce the negative binomial model let us consider a more restricted characterization
of a point distribution. Divide the sample of $\mu$ points into $m$ identical cells of volume
$V_c = V/m$ and of a given shape, and consider the probability $P_\mu(k_1, \ldots, k_m; N)$ of
having $k_1$ particles in cell 1, $k_2$ particles in cell 2, ..., and $k_m$ particles in cell $m$.
For the uniform distribution, where any point has the same probability of being in any cell,
the probability of the configuration $k_1, \ldots, k_m$ is related to that of the configurations
$k_1, \ldots, k_i - 1, \ldots, k_m$ by

$$P_\mu(k_1, \ldots, k_m; N) = \frac{1}{m} \sum_{i=1}^{m} P_{\mu-1}(k_1, \ldots, k_i - 1, \ldots, k_m; N). \qquad (7)$$

This gives the uniform distribution; for $\mu = N$,

$$P_N(k_1, \ldots, k_m; N) = \frac{N!}{k_1! \cdots k_m!}\, \frac{1}{m^N}. \qquad (8)$$


What we have called the quasi-Poisson distribution is defined by considering the
probability that a point belongs to a particular cell to be proportional to the number of points
which are already occupying this cell. The proportionality constant will be called $g$.
It may depend on the volume and on the shape of the cell, and can be interpreted as
a kind of attraction/repulsion parameter. The value $g = 0$ represents no interaction
and reproduces the uniform distribution. With these considerations, the recurrence
relation corresponding to (7) is in this case

$$P_\mu(k_1, \ldots, k_m; N) = C \sum_{i=1}^{m} P_{\mu-1}(k_1, \ldots, k_i - 1, \ldots, k_m; N)\, [1 + (k_i - 1) g], \qquad (9)$$

with $C$ a normalization constant that can be easily computed, which leads to

$$P_\mu(k_1, \ldots, k_m; N) = \sum_{i=1}^{m} P_{\mu-1}(k_1, \ldots, k_i - 1, \ldots, k_m; N)\, \frac{1 + (k_i - 1) g}{m + (\mu - 1) g}. \qquad (10)$$
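The placement rule behind (9) and (10) can be sketched as a sequential simulation: points are dropped one at a time, and a cell already holding $k$ points is chosen with weight $1 + gk$. The function names and the values used below are illustrative assumptions, and the sketch assumes $g \ge 0$ so that all weights are positive.

```python
import random


def place_points(num_points, num_cells, g, rng):
    """Drop points one by one; a cell with k points has weight 1 + g*k (g >= 0)."""
    occupancy = [0] * num_cells
    cells = range(num_cells)
    for _ in range(num_points):
        weights = [1.0 + g * k for k in occupancy]
        chosen = rng.choices(cells, weights=weights)[0]
        occupancy[chosen] += 1
    return occupancy


def empty_cell_fraction(num_points, num_cells, g, trials, seed=0):
    """Monte Carlo estimate of the void probability of a single cell."""
    rng = random.Random(seed)
    empty = 0
    for _ in range(trials):
        empty += place_points(num_points, num_cells, g, rng).count(0)
    return empty / (trials * num_cells)


if __name__ == "__main__":
    for g in (0.0, 0.5, 1.0):
        print("g =", g, "empty fraction =", empty_cell_fraction(20, 20, g, trials=500))
```

Increasing $g$ at fixed mean occupancy should raise the fraction of empty cells, since points pile up in cells that are already occupied.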


2.3 Explicit solution for the distribution 

We have been able to solve the recurrence (10) analytically. In Appendix A.1 we show
that

$$P_N(k_1, \ldots, k_m; N) = \frac{N!}{k_1! \cdots k_m!} \prod_{j=1}^{m} \prod_{l=1}^{k_j} \frac{1 + (l-1) g}{m + g\left(\sum_{i=1}^{j-1} k_i + l - 1\right)}. \qquad (11)$$

This result reproduces the uniform distribution (8) when $g = 0$. Once $g$ is known,
all the information about the system in the configuration space is obtained. We can
compute, in particular, the expectation value for the number of particles in a cell and
also its fluctuation. This gives a directly observable magnitude which can therefore be
checked experimentally: the counts in cells. With the aim of computing this average,
let us first elaborate on (11) in order to obtain from it the corresponding distribution
of the usual random variables $x_i$, the number of cells with $i$ points inside. In Appendix
A.1 it is found that

$$P_N(x_0, \ldots, x_N) = \frac{m!\, N!}{\prod_{i=0}^{N} x_i!\, (i!)^{x_i}}\; \frac{\Gamma(m/g) \prod_{i=2}^{N} [\Gamma(1/g + i)]^{x_i}}{\Gamma(m/g + N)\, g^{\,m - x_0}\, [\Gamma(1/g + 1)]^{\,m - (x_0 + x_1)}}. \qquad (12)$$

From this expression it is possible to calculate the mean value $E(x_i)$ and the
variance $V(x_i)$ associated with this random variable. This is done by adding up the
corresponding contributions from (12):

$$E(x_i) = \sum x_i\, P_N(x_0, \ldots, x_i, \ldots, x_N). \qquad (13)$$

In Appendix A.2 this is seen to give

$$E(x_i) = A(m, N, i), \qquad (14)$$

where

$$A(m, N, i) = \frac{m}{g} \binom{N}{i} \frac{\Gamma(1/g + i)\, \Gamma(m/g)\, \Gamma\!\left(\frac{m-1}{g} + N - i\right)}{\Gamma(1/g + 1)\, \Gamma(m/g + N)\, \Gamma\!\left(\frac{m-1}{g}\right)} \qquad (15)$$

is a convenient expression introduced in order to express the different characteristics of
the probability distribution in a simple way. For instance, its variance can be written
as

$$V(x_i) = A(m, N, i)\, [1 + A(m-1, N-i, i) - A(m, N, i)]. \qquad (16)$$

These results can be tested in the limit $g \to 0$, where they actually reproduce the
results for the binomial distribution, since $g = 0$ means no interaction.
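Expressions (15) and (16) can be evaluated directly with log-Gamma functions. The sketch below uses hypothetical function names and requires $g > 0$; it needs $m \ge 2$ for the mean and $m \ge 3$ for the variance so that all Gamma arguments stay positive.

```python
import math


def log_binomial(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def expected_cells(m, N, i, g):
    """A(m, N, i) of eq. (15): mean number of cells holding exactly i points."""
    if i < 0 or i > N:
        return 0.0
    a = 1.0 / g
    log_A = (math.log(m / g) + log_binomial(N, i)
             + math.lgamma(a + i) + math.lgamma(m * a)
             + math.lgamma((m - 1) * a + N - i)
             - math.lgamma(a + 1.0) - math.lgamma(m * a + N)
             - math.lgamma((m - 1) * a))
    return math.exp(log_A)


def variance_cells(m, N, i, g):
    """V(x_i) of eq. (16)."""
    A = expected_cells(m, N, i, g)
    return A * (1.0 + expected_cells(m - 1, N - i, i, g) - A)


if __name__ == "__main__":
    m, N, g = 10, 15, 0.5  # illustrative values only
    means = [expected_cells(m, N, i, g) for i in range(N + 1)]
    print("sum of E(x_i)  :", sum(means), "(number of cells)")
    print("sum of i*E(x_i):", sum(i * A for i, A in enumerate(means)), "(number of points)")
    print("V(x_0)         :", variance_cells(m, N, 0, g))
```

Two consistency checks follow from the definition of $x_i$: the means must add up to the number of cells $m$, and weighted by $i$ they must add up to the number of points $N$.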

If we now take the continuum limit for the probability of finding $i$ points in a certain
cell, $P_i = E(x_i)/m$, we get

$$P_i(V_c) = \frac{(nV_c)^i}{i!}\, (1 + g\, nV_c)^{-i - 1/g} \prod_{j=1}^{i-1} (1 + g j), \qquad (17)$$

which also reproduces the Poisson distribution for $g \to 0$. This is precisely the so-called
negative binomial distribution, presented in Section 2.

The distribution (17) above can also be tested by calculating $\sum_i i^2 P_i$ directly, which
in Appendix A.3 is shown to yield

$$\sum_i i^2 P_i = (nV_c)^2 + nV_c + g\,(nV_c)^2. \qquad (18)$$
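Equations (17) and (18) lend themselves to a quick numerical check. The sketch below (hypothetical function name, illustrative parameter values) builds the $P_i$ from the ratio $P_{i+1}/P_i = nV_c\,(1 + g i)/[(i+1)(1 + g\, nV_c)]$, which follows from (17), and then sums the first moments.

```python
import math


def counts_in_cells(nV, g, i_max):
    """Return [P_0, ..., P_i_max] of eq. (17); g = 0 gives the Poisson limit."""
    p0 = math.exp(-nV) if g == 0 else (1.0 + g * nV) ** (-1.0 / g)
    probs = [p0]
    for i in range(i_max):
        ratio = nV * (1.0 + g * i) / ((i + 1) * (1.0 + g * nV))
        probs.append(probs[-1] * ratio)
    return probs


if __name__ == "__main__":
    nV, g = 2.0, 0.6  # illustrative values only
    probs = counts_in_cells(nV, g, 400)
    second_moment = sum(i * i * q for i, q in enumerate(probs))
    print("sum P_i     :", sum(probs))
    print("sum i P_i   :", sum(i * q for i, q in enumerate(probs)))
    print("sum i^2 P_i :", second_moment, "eq. (18):", nV ** 2 + nV + g * nV ** 2)
```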

By comparing with (5), the value of $g$ is completely fixed:

$$g(V_c) = \frac{1}{V_c^2} \int_{V_c}\int_{V_c} \xi_2(r_1, r_2)\, dr_1\, dr_2, \qquad (19)$$

which happens to be a well-known quantity, usually referred to in the literature as $\bar{\xi}$
(cf. e.g. Peebles 1980, Fry 1986, Balian and Schaeffer 1988), and is also related to the
well-known integral $J_3$. Notice that one of the integrals in (19) can be readily performed
using the fact that $\xi(r_1, r_2) = \xi(r_1 - r_2)$, but the second one is restricted to some
different contour and will depend on the shape of the cell.
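For a spherical cell and a power-law correlation function $\xi(r) = (a/r)^\gamma$, the double integral in (19) has a standard closed form. The sketch below compares it with a one-dimensional quadrature over the distribution of pair separations inside a ball. Both the closed form and the pair-separation density are textbook results brought in for illustration, not formulas stated in this paper; the function names and default values are hypothetical.

```python
import math


def xi_bar_sphere_analytic(radius, a=5.0, gamma=1.8):
    """Closed form of (19) for a sphere and xi(r) = (a/r)**gamma (gamma < 3)."""
    return 72.0 * (a / radius) ** gamma / (
        (3.0 - gamma) * (4.0 - gamma) * (6.0 - gamma) * 2.0 ** gamma)


def xi_bar_sphere_numeric(radius, a=5.0, gamma=1.8, steps=20000):
    """Midpoint quadrature of xi(r) times the pair-separation density in a ball."""
    h = 2.0 * radius / steps
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) * h
        x = r / radius
        # Density of the distance between two uniform random points in a ball.
        pair_pdf = 3.0 * x * x * (1.0 - 0.75 * x + x ** 3 / 16.0) / radius
        total += (a / r) ** gamma * pair_pdf * h
    return total


if __name__ == "__main__":
    for R in (1.5, 5.0, 7.5):
        print(R, xi_bar_sphere_analytic(R), xi_bar_sphere_numeric(R))
```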

It is important to observe that, in addition to the new interpretation of the negative binomial
distribution, we have been able to derive a very concrete model for the box distribution,
$P_N(k_1, \ldots, k_m; N)$, eq. (11), a discrete version of the counts in cells for a finite sample,
eqs. (14) and (15), and its variance, eq. (16).


3 Comparison with data 

We now turn to the crucial point of comparing this information with real data. In
doing so, we face a rather strong constraint from the theoretical model, namely that
the expression for counts in cells (17) represents a very concrete prediction without any
free parameter to be fitted, provided we compute the fluctuation $\bar{\xi}$ directly from the
data.

We have checked counts in cells in a redshift survey using a volume-limited sample
from the completed North Zwicky Forty CfA catalog, $m_B < 14.5$ (Huchra et al. 1983).
We will focus on a sample with $M_B < -20$ and $v < 8000$ km/s. Cell counts $P_i$ have
been computed by placing a random cell on the sample and counting the number
of objects inside the cell. This has been repeated for 40,000 to
4,000,000 cells, depending on the statistical uncertainties, and we have then counted
the number of cells having a given number, $i$, of galaxies inside. This provides our
result for $P_i$. Errors are estimated by going through different independent samples.
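The counting procedure just described can be sketched as follows. This is an illustration on a synthetic, uncorrelated point set in a cubic box, not the CfA sample or the authors' code; the function name, box geometry and sample sizes are all assumptions. Cell centres are drawn so that every cell lies completely inside the box.

```python
import math
import random


def counts_in_cells(points, box_size, radius, num_cells, rng):
    """Estimate P_i by throwing spherical cells that lie fully inside the box."""
    r2 = radius * radius
    histogram = {}
    for _ in range(num_cells):
        cx, cy, cz = (rng.uniform(radius, box_size - radius) for _ in range(3))
        inside = 0
        for x, y, z in points:
            if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2:
                inside += 1
        histogram[inside] = histogram.get(inside, 0) + 1
    return {i: c / num_cells for i, c in sorted(histogram.items())}


# Synthetic Poisson-like sample (assumption: uniform random points in a unit box).
rng = random.Random(42)
box_size, radius = 1.0, 0.05
points = [(rng.random(), rng.random(), rng.random()) for _ in range(800)]
probabilities = counts_in_cells(points, box_size, radius, 1200, rng)
mean_count = len(points) * (4.0 / 3.0) * math.pi * radius ** 3 / box_size ** 3
print("measured P_0:", probabilities.get(0, 0.0))
print("Poisson P_0 :", math.exp(-mean_count))
```

For an uncorrelated sample the measured $P_0$ should scatter around the Poisson value $\exp(-nV_c)$; a clustered sample would give a larger void probability.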

Boundary problems have been dealt with by choosing only cells which fall completely
inside the sample. This introduces a bias, because some zones of the sample (the
central ones) are overweighted. The effect is bigger for bigger cells (for spherical 10
Mpc cells the contour zone which is underweighted represents 80% of the sample) and
thus values for big cells contain inescapable uncertainties.

We have used the standard known value for the correlation function, $\xi(r) \simeq (a/r)^\gamma$,
with $\gamma \simeq 1.8$ and $a \simeq 5$ Mpc ($H_0 = 100$ km s$^{-1}$ Mpc$^{-1}$) for the 14.5 $m_B$ CfA catalog
(Davis and Peebles 1983), to estimate $g(V_c)$ (or $\bar{\xi}$) in (19). The observed two-variable
distribution of counts in cells, $P_i(R)$, is automatically fitted by the negative
binomial distribution (17) for all significant values of $R$ and $i$. The fitted curve is
shown in Fig. 1, where for the sake of clarity we have plotted observational values only
for voids, $P_0$, but for different cell sizes, compared with theoretical values for different
$\gamma$ and $a$ corresponding to different uncertainties in the knowledge of $\xi$. The average
fitted values, that is $a = 5$ Mpc and $\gamma = 1.8$, match the observations perfectly.


4 The shape dependence and anisotropies 

An interesting characteristic of counts in cells is their ability to provide a good measure
of some statistical shape features. When the shape of the cell is no longer a sphere
but rather, e.g., an ellipsoid of a given fixed volume but of varying shape, information
about the way galaxies choose to cluster can be easily extracted. We can, for simplicity,
characterize an ellipsoid by its three half-axes $a, b, c$, with the restriction $abc = R^3$,
and take one of them fixed and the other two equal to $\sqrt{s}R$ and $R/\sqrt{s}$, so that the
eccentricity, $s$, and the radius, $R$, characterize the shape of the cell $V_c(R, s)$. If clusters
or groups tend to appear with elongated shapes, counts in cells $P_i(R, s)$ will increase with
$s$ ($s > 1$), up to a maximum value of $s$, and if voids are statistically more likely
to be spherical, $P_0(R, s)$ will decrease with $s$. Since the probability $P_i$ must be
normalized when summed over all values of $i$, there is a relation between these last
effects: if the void probability decreases with increasing $s$, the occupancy probability
must increase.

For the negative binomial model the shape dependence comes from the integration
of $\xi$ in (19) since, for a given volume, different shapes will contribute to $\bar{\xi}$ with different
numerical factors. Formally, $\bar{\xi}$ is then a function of $R$ and $s$, $\bar{\xi}(R, s)$. We can introduce
a shape factor $\eta(R, s)$,

$$\bar{\xi}(R, s) = \eta(R, s)\, \bar{\xi}(R, 1), \qquad (20)$$

so that $\eta(R, 1) = 1$. For ellipsoids this factor is a decreasing function of the eccentricity,
$s$, as can be seen in Fig. 2, where the values of $\bar{\xi}$ are obtained as a function of $s$ by
numerical integration of (19), using a power law for $\xi$. The corresponding effect on $P_i$
is shown in Fig. 3, where the values of $P_i$ coming from eq. (17) are shown as a function
of $s$, for a given $V_c$. We see that there is a bigger probability to find spherical voids
and elongated clusters than elongated voids and spherical clusters.
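For a pure power law the shape factor of (20) can be estimated without a six-dimensional integration. The sketch below relies on a shortcut that is an assumption of this illustration, not the authors' stated method: mapping the ellipsoid with half-axes $R$, $\sqrt{s}R$, $R/\sqrt{s}$ linearly onto a ball, the pair-separation direction is isotropic, so that $\eta(s)$ reduces to the angular average of $(n_x^2 + s\, n_y^2 + n_z^2/s)^{-\gamma/2}$ over unit vectors. The function name and grid sizes are hypothetical.

```python
import math


def shape_factor(s, gamma=1.8, n_mu=200, n_phi=200):
    """eta(s) for xi ~ r**(-gamma) and half-axes R, sqrt(s)*R, R/sqrt(s).

    Midpoint rule over one octant of the unit sphere (the integrand is even
    in every component of the unit vector).
    """
    power = -0.5 * gamma
    total = 0.0
    for j in range(n_mu):
        mu = (j + 0.5) / n_mu              # cos(theta), polar axis along z
        sin2_theta = 1.0 - mu * mu
        for k in range(n_phi):
            phi = (k + 0.5) * (0.5 * math.pi) / n_phi
            quad_form = (sin2_theta * (math.cos(phi) ** 2 + s * math.sin(phi) ** 2)
                         + mu * mu / s)
            total += quad_form ** power
    return total / (n_mu * n_phi)


if __name__ == "__main__":
    for s in (1.0, 2.0, 3.0, 4.0, 5.0):
        print("s =", s, "eta =", shape_factor(s))
```

Under this assumption $\eta$ does not depend on $R$ and decreases as the cell is elongated, in line with the trend described for Fig. 2.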

Let us now compare, as in the previous section, these predictions with observational
data, just by performing the same counts for elongated cells. We find the striking result
that there is a big anisotropy in the catalog, because $P_i(R, s)$ depends strongly on which
one of the axes is chosen to be fixed or, in other words, on the precise plane where
we perform the elongation, and also on the orientation of the ellipsoid's section inside
this plane. In Fig. 4 we have plotted different probabilities for $P_0$, extracted from the
catalog, varying with $s$ and for six differently oriented ellipsoids, corresponding to
three different frames of orthogonal directions for the fixed axes or plane. Inside each
plane we choose two distinct perpendicular orientations.

Very different values are found for the various orientations; the differences between
them increase with eccentricity, that is, with the main scale involved. However, some
warnings should be made concerning this anisotropy. As pointed out before, there
exist inescapable uncertainties due to the contour, and this effect also increases with
the scale, just as our anisotropy does. The contour forces us to center the cells in a
certain inner region which represents only 10% of the total volume for $R = 5$ Mpc
and $s = 5$ (the main distance involved is 25 Mpc). Anisotropies in this central region,
where the Coma cluster is located, contribute strongly to the effect. There is another
point to be taken into account, which is the fact that the contour is not symmetric
and, thus, the central region just mentioned is not the same for every direction. Peculiar
velocity distortions coming from the Coma cluster (the 'finger of God' effect) are, very
possibly, responsible for this effect. Actually this anisotropy had already been observed
(Elizalde and Gaztanaga 1988) in a slice catalog, almost entirely included in the one we are
dealing with: the first CfA2 $m_B = 15.5$ slice (de Lapparent, Geller and Huchra 1986,
Huchra et al. 1990), which contains the Coma cluster.

We have performed estimations of counts in cells with different corrections in the
redshifts, to take into account the local motion of our galaxy. No significant differences
in the results have been found when data are corrected to the MWB inertial
frame, the Local Group frame or galactocentric velocities.

The above is a new result that can be measured quantitatively via the
$P_i(R, s, \alpha)$, where $\alpha$ stands for the orientation angles, but it interferes somewhat with
our original purpose of checking the dependence on the shape. Although a detailed
analysis of the anisotropy dependence has not yet been performed, we can average over
all the orientations in order to obtain an estimate for the $P_i(R, s)$. The real average
would probably have some weight which we do not know and thus we can only expect
to find some fit within the standard deviations.

In Fig. 6 error bars are compared with predictions from the model using the
corresponding average on (19), and good overall agreement is found within the error
bars, again without having to adjust any parameter.


5 Discussion and conclusions 

As mentioned in the introduction, our analysis of the shape dependence can be extended
to very different models, the negative binomial distribution being an appropriate
phenomenological tool that can be used to illustrate it.

In general, the dependence of $P_0$ on the shape of the cell can be easily obtained
for any scale-invariant model for which we already know the volume dependence. For
the scale-invariant models of the galaxy distribution, it is already known (Fry 1986,
Balian and Schaeffer 1988) that $\chi = -(\log P_0)/nV$ has to be a function of the combined
variable $nV\bar{\xi}$ alone. This statement is true for a cell of any shape, so that the only
way the shape can affect the probability $P_0$ is through $\bar{\xi}$. We can use, for example,
ellipsoids as cells and characterize their shape, for a fixed volume, by the value of the
eccentricity $s$ (for spheres $s = 1$). The dependence $\bar{\xi}(s)$ is obtained, from a known
correlation function, by performing the integration in (19), which will introduce
an additional factor, $\eta(s)$, coming just from the shape, so that $\bar{\xi}(s) = \eta(s)\,\bar{\xi}(1)$ (see
eq. (20)). Thus, given a sample where the correlation function and the functional
dependence of $\chi$ (or $P_0$) on $nV$ are known for a given shape, e.g. for $s = 1$, one can
easily predict the values corresponding to other shapes, since this just requires the
appropriate scaling of $nV$. Formally,

$$\chi[nV\bar{\xi}(s)] = \chi[nV'\bar{\xi}(1)], \qquad (21)$$

where $nV' = nV\eta(s)$. In the previous sections it has been illustrated how to make this
prediction for a concrete analytical distribution, but our whole study is completely
general, and can be reproduced numerically without referring to the negative binomial
distribution.
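The rescaling (21) can be sketched in a few lines. The negative binomial form of $\chi$ is used below only as a stand-in for a scaling function measured with spherical cells; the function names and the numerical values of $nV$, $\bar{\xi}(1)$ and $\eta$ are illustrative assumptions.

```python
import math


def chi_negative_binomial(x):
    """Scaling function chi(x) = log(1 + x)/x, with x = n*V*xi_bar."""
    return 1.0 if x == 0.0 else math.log1p(x) / x


def void_probability_for_shape(nV, xi_bar_sphere, eta, chi=chi_negative_binomial):
    """P_0 for a cell of shape factor eta, from chi known for spheres (eq. 21)."""
    return math.exp(-nV * chi(nV * xi_bar_sphere * eta))


if __name__ == "__main__":
    nV, xi_bar_sphere = 1.0, 1.86  # hypothetical values
    for eta in (1.0, 0.9, 0.8):    # hypothetical shape factors
        print("eta =", eta, "P_0 =", void_probability_for_shape(nV, xi_bar_sphere, eta))
```

Any other measured or tabulated $\chi$ can be passed through the `chi` argument; a shape factor below one lowers the argument of $\chi$ and therefore lowers $P_0$ at fixed volume.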

We conclude, then, that the shape dependence of $P_0$ (or $\chi$) can be predicted from
the values which are obtained for a given fixed shape, provided that the distribution
is scale invariant. Although the analysis presented here is quite restricted (only one
sample has been considered) and affected by large anisotropies (coming probably from
the Coma cluster), good qualitative agreement has been observed. We propose this
type of analysis as a further way to test the functional dependence of $\chi$ on $nV\bar{\xi}$,
$\chi = \chi(nV\bar{\xi})$, and thus to study the scale-invariant characteristics of the galaxy distribution.

A Appendix


A.1 Configurational probability into boxes

We would like first to show how to solve the recurrence relation (10):

$$P_\mu(k_1, \ldots, k_m; N) = \sum_{i=1}^{m} P_{\mu-1}(k_1, \ldots, k_i - 1, \ldots, k_m; N)\, \frac{1 + (k_i - 1) g}{m + (\mu - 1) g}. \qquad (22)$$

This can be done by applying the recurrence several times, one after the other. It
is convenient to realize at this stage that $\mu = \sum_{q=1}^{m} k_q$, since this is the total number
of points to be distributed into the $m$ cells, with occupation numbers $k_q$, $q = 1, \ldots, m$.
If we apply (22) two successive times, and take into account that $\mu$ must be replaced
by $\mu - 1$ (since one of the $\mu$ total points is already fixed in $k_i$), we get

$$P_\mu(k_1, \ldots, k_m; N) = \sum_{i=1}^{m} \sum_{l=1}^{m} P_{\mu-2}(k_1, \ldots, k_i - 1, \ldots, k_l - 1, \ldots, k_m; N)\, \frac{1 + (k_i - 1) g}{m - g + g \sum_q k_q}\; \frac{1 + (k_l - 1) g}{m - 2g + g \sum_q k_q} \qquad (23)$$

(for $l = i$ the occupation $k_i$ is lowered twice, so that the second numerator reads $1 + (k_i - 2) g$).
Going on with the recurrence and re-expressing the summations in terms of
products, we arrive at

$$P_N(k_1, \ldots, k_m; N) = \frac{N!}{k_1! \cdots k_m!} \prod_{j=1}^{m} \prod_{l=1}^{k_j} \frac{1 + (l-1) g}{m + g\left(\sum_{i=1}^{j-1} k_i + l - 1\right)}, \qquad (24)$$

which is indeed eq. (11).
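For small systems the closed form (24) can be compared with a brute-force evaluation of the recurrence (22). The sketch below uses hypothetical function names and arbitrary test values; the recurrence is started from the empty configuration, which has probability one.

```python
import math
from functools import lru_cache
from itertools import product


def closed_form(ks, g):
    """Eq. (24): probability of the occupation numbers ks = (k_1, ..., k_m)."""
    m, N = len(ks), sum(ks)
    prob = float(math.factorial(N))
    for k in ks:
        prob /= math.factorial(k)
    placed = 0  # number of points in the cells preceding the current one
    for k in ks:
        for l in range(1, k + 1):
            prob *= (1.0 + (l - 1) * g) / (m + g * (placed + l - 1))
        placed += k
    return prob


def by_recurrence(ks, g):
    """Eq. (22), applied recursively down to the empty configuration."""
    m = len(ks)

    @lru_cache(maxsize=None)
    def prob(state):
        mu = sum(state)
        if mu == 0:
            return 1.0
        total = 0.0
        for i, k in enumerate(state):
            if k > 0:
                previous = state[:i] + (k - 1,) + state[i + 1:]
                total += prob(previous) * (1.0 + (k - 1) * g) / (m + (mu - 1) * g)
        return total

    return prob(tuple(ks))


if __name__ == "__main__":
    ks, g = (2, 0, 1, 3), 0.7
    print(closed_form(ks, g), by_recurrence(ks, g))
    configs = [c for c in product(range(5), repeat=3) if sum(c) == 4]
    print("normalization:", sum(closed_form(c, g) for c in configs))
```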

In terms of the variables $x_i$ (number of cells which have $i$ points inside), this
expression can be further elaborated, in the form

$$P_N(k_1, \ldots, k_m; N) = \frac{N!}{(0!)^{x_0} (1!)^{x_1} \cdots (N!)^{x_N}}\; \frac{\prod_{i=1}^{N} \left[\prod_{l=1}^{i} \bigl(1 + (l-1) g\bigr)\right]^{x_i}}{\prod_{h=0}^{N-1} (m + g h)}, \qquad (25)$$

from which we get the probability density in terms of the $x_i$:

$$P(x_0, \ldots, x_N) = \frac{m!}{x_0!\, x_1! \cdots x_N!}\; \frac{N!}{(0!)^{x_0} (1!)^{x_1} \cdots (N!)^{x_N}}\; \frac{\Gamma(m/g) \prod_{l=2}^{N} [\Gamma(1/g + l)]^{x_l}}{\Gamma(m/g + N)\, g^{\,m - x_0}\, [\Gamma(1/g + 1)]^{\,m - (x_0 + x_1)}}. \qquad (26)$$

This is the same as eq. (12).

A.2 Expectation value and variance

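Let us now calculate the mean value $E(x_i)$ and the variance $V(x_i)$:

$$E(x_i) = \sum_{(\sum_j x_j = m)\,(\sum_j j x_j = N)} x_i\, P_N(x_0, \ldots, x_i, \ldots, x_N), \qquad V(x_i) = E(x_i^2) - E(x_i)^2, \qquad (27)$$

for expression (12). This is computed by noticing that

$$\sum_{(\sum_j x_j = m)\,(\sum_j j x_j = N)} P_N(x_0, \ldots, x_N) = 1, \qquad (28)$$

and that, for any $x_j \ge 1$,

$$\sum_{(\sum_l x_l = m-1)\,(\sum_l l x_l = N-j)} P_{N-j}(x_0, \ldots, x_j - 1, \ldots, x_N) = 1. \qquad (29)$$

Use of these expressions allows us to write

$$E(x_i) = \sum_{x_i \ge 1} x_i\, P(x_0, \ldots, x_i, \ldots, x_N), \qquad (30)$$

and replacing (12), we get

$$E(x_i) = \frac{m\, N!\, \Gamma(1/g + i)}{g\, (N-i)!\, i!\, \Gamma(1/g + 1)} \sum \frac{(m-1)!}{x_0! \cdots (x_i - 1)! \cdots x_N!}\; \frac{(N-i)!}{\prod_j (j!)^{x_j'}}\; \frac{\Gamma(m/g) \prod_{j=2}^{N} [\Gamma(1/g + j)]^{x_j'}}{\Gamma(m/g + N)\, g^{\,m - 1 - x_0}\, [\Gamma(1/g + 1)]^{\,m - 1 - (x_0 + x_1)}}, \qquad (31)$$

where $x_j' = x_j$ for $j \ne i$. By calling

$$x_i' = x_i - 1 \qquad (32)$$

and by writing the sum which appears in (31) as

$$\sum_{(x_0 + \cdots + x_i' + \cdots + x_N = m-1)\,(x_1 + \cdots + i x_i' + \cdots + N x_N = N-i)} P(x_0, \ldots, x_i', \ldots, x_N) = 1, \qquad (33)$$

we arrive at the very simple expression

$$E(x_i) = \frac{m\, N!\, \Gamma(1/g + i)\, \Gamma(m/g)\, \Gamma\!\left(\frac{m-1}{g} + N - i\right)}{g\, (N-i)!\, i!\, \Gamma(1/g + 1)\, \Gamma(m/g + N)\, \Gamma\!\left(\frac{m-1}{g}\right)}. \qquad (34)$$

This can be written in the form

$$E(x_i) = A(m, N, i), \qquad (35)$$

where

$$A(m, N, i) = \binom{N}{i} \frac{\Gamma(1/g + i)\, \Gamma(m/g + 1)\, \Gamma\!\left(\frac{m-1}{g} + N - i\right)}{\Gamma(1/g + 1)\, \Gamma(m/g + N)\, \Gamma\!\left(\frac{m-1}{g}\right)} \qquad (36)$$

is a convenient expression introduced in order to write in a simple, compact way the
different characteristics of the probability distribution. A similar calculation for the
variance yields

$$V(x_i) = A(m, N, i)\, [1 + A(m-1, N-i, i) - A(m, N, i)]. \qquad (37)$$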


A.3 Fluctuations 

We now want to obtain $g$ in (17), by first calculating $\sum_i i^2 P_i$ and then comparing
with (5):

$$\sum_i i^2 P_i = (nV_c)^2 + nV_c + n^2 \int_{V_c}\int_{V_c} \xi_2\, dr_1\, dr_2. \qquad (38)$$

We start from (17),

$$P_i = \frac{(nV_c)^i}{i!}\, (1 + g\, nV_c)^{-i - 1/g} \prod_{j=1}^{i-1} (1 + g j), \qquad (39)$$

with

$$\sum_i P_i\, i = nV_c. \qquad (40)$$

Differentiating (40) with respect to $n$, we get

$$\sum_i i\, \frac{dP_i}{dn} = V_c. \qquad (41)$$

On the other hand, by computing $dP_i/dn$ from (39),

$$\frac{dP_i}{dn} = \frac{i P_i}{n} - \left(i + \frac{1}{g}\right) \frac{P_i\, g V_c}{1 + g\, nV_c}, \qquad (42)$$

and substituting in (41), we get

$$\sum_i i^2 P_i = (nV_c)^2 + nV_c + n^2 V_c^2\, g. \qquad (43)$$

Finally, comparison of this equality with (38) yields the expression for $g$ we were looking
for:

$$g = \frac{1}{V_c^2} \int_{V_c}\int_{V_c} \xi_2(r_1, r_2)\, dr_1\, dr_2. \qquad (44)$$

This is eq. (19).

Figure captions

Fig. 1: Values of $P_0$ (in %) for cells of different radii: $R$ = 1.5 - 7.5 Mpc. Dashed
circles represent observational values, the dashed line is the theoretical value for $a = 5$
Mpc and $\gamma = 1.8$, and the continuous upper and lower lines are the theoretical values
for $a = 4$ Mpc and $\gamma = 1.6$, and $a = 8$ Mpc and $\gamma = 1.85$, respectively.

Fig. 2: Values of the averaged two-point correlation function, $\bar{\xi}$, normalized to the
value of $\bar{\xi}$ for a sphere, are computed numerically for a power-law correlation function,
$\xi$, and plotted as a function of the eccentricity of the cell, for a fixed volume, over
which $\xi$ is averaged.

Fig. 3: Values of counts in cells, $P_i$, for $i$ = 0, 1, 2, 3, are plotted for the negative
binomial model as a function of the eccentricity of the cell for a fixed volume.

Fig. 4: Variation of $P_0$ (in %) with the eccentricity, $s$ = 1-5, with fixed $R = 5$ Mpc, for
different orientations. Lines with the same type of dash represent two different orientations
in the same plane. Different dashes correspond to three different orthogonal choices of
the plane of elongation.

Fig. 5: Values of $P_0$ (in %) for different shapes of the cell, parametrized by its
eccentricity $s$ = 1-5, with $R = 5$ Mpc. The error bars correspond to observed deviations
with orientation. The continuous line corresponds to the theoretical prediction, with
$a = 5$ Mpc and $\gamma = 1.8$.


Fig. 6: The same as in Fig. 5, for $P_2$.







